A theory of additive Markov chains with long-range memory is used to describe the correlation properties of coarse-grained literary texts. The complex structure of the correlations in texts is revealed: antipersistent correlations at small distances, L < 300, and persistent ones at L > 300 define this nontrivial structure. For some concrete examples of literary texts, the memory functions are obtained and their power-law behavior at long distances is disclosed. This property is shown to be a cause of the self-similarity of texts with respect to the decimation procedure.
INTRODUCTION

The problem of long-range correlated stochastic dynamic systems (LRSCS) has been under study for a long time in many areas of contemporary physics, biology, economics [14], etc. [15]. Important examples of complex LRSCS are literary texts.

One way to gain a correct insight into the nature of correlations in a symbolic system is to construct a mathematical object (for example, a correlated sequence of symbols) possessing the same statistical properties as the initial dynamic system. There exist many algorithms for generating long-range correlated sequences: the inverse Fourier transformation [15, 21], the expansion-modification Li method [22], the Voss procedure of consequent random additions [23], the correlated Levy walks [24], etc. [15]. We believe that, among the above-mentioned methods, the use of many-step Markov chains is one of the most important because it allows constructing random sequences with prescribed correlation properties in the most natural way. This was demonstrated in Ref. [20], where Markov chains with a step-like memory function (MF) were studied. It was shown that there exist some dynamical systems (coarse-grained sequences of Eukarya DNA and dictionaries) with correlation properties that can be properly described by this model.

The many-step Markov chain is a sequence of symbols of some alphabet constructed using a conditional probability function, which determines the probability of occurrence of a definite symbol of the sequence depending on the N previous ones. The property of additivity of a Markov chain means that the different previous symbols influence the generated one independently. The concept of additivity, first introduced in Ref. [20], was later generalized to the case of binary non-stationary Markov chains. Another generalization was based on the consideration of Markov sequences with a many-valued alphabet.

An efficient method for investigating LRSCS consists in decomposing the space of states into a finite number of parts labelled by definite symbols, which are naturally ordered according to the dynamics of the system. The most frequently used decomposition procedure is based on the introduction of two parts of the phase space. In other words, the approach presupposes mapping two kinds of states into two symbols, say 0 and 1. This procedure is often referred to as coarse graining. Thus, the problem is reduced to investigating the statistical properties of binary sequences.
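As an illustration of coarse graining, the following minimal Python sketch maps a text onto a binary sequence by splitting the alphabet into two halves, (a-m) to 0 and (n-z) to 1, which is one of the mappings used later in this paper. The function name `coarse_grain`, the `split` parameter, and the decision to skip spaces, digits and punctuation are assumptions made for the example; the text does not specify how non-letter characters are treated.

```python
def coarse_grain(text: str, split: str = "m") -> list[int]:
    """Map letters 'a'..split to 0 and the remaining letters up to 'z' to 1.

    Assumption: characters outside a-z (spaces, digits, punctuation) are skipped.
    """
    bits = []
    for ch in text.lower():
        if "a" <= ch <= "z":
            bits.append(0 if ch <= split else 1)
    return bits


if __name__ == "__main__":
    print(coarse_grain("In the beginning"))
```

Other mappings (for example, letters with even and odd positions in the alphabet) can be obtained by replacing the comparison `ch <= split` with a different two-valued rule.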

It might be thought that coarse graining could result in the loss of, at least, the short-range memory in the sequence. It has also been argued that mapping a given sequence into a small-alphabet sequence does not necessarily imply that the long-range correlations present in the initial text are preserved. However, as was shown in Ref. [20], the statistical properties of coarse-grained texts depend, but not significantly, on the kind of mapping. This implies that only a small part of all possible kinds of mapping can destroy the initial correlations in the system. Below, we demonstrate that coarse graining retains, although not completely, the correlations at all distances. This means that, to analyze the correlation properties of dynamic systems, there is no need to code every symbol (associating every part of the phase space of the system with its binary code), as is done, for example, in Ref. [18]; it is sufficient to use the coarse-graining procedure.

In the present work, we study coarse-grained literary texts, treating them as additive Markov chains. A recently obtained equation [29] connecting two mutually complementary characteristics of these sequences, the memory and correlation functions, is used. Once the memory function of the original sequence is found from the analysis of the correlation function, we construct the corresponding Markov chain with the same statistical properties. This method for constructing a sequence of elements with a given correlation function seems to be very important for other applications; e.g., it can be employed to fabricate effective filters of electrical or optical signals [30].

We show that the memory function of any coarse-grained literary text is characterized by a complex structure because of the competition between two kinds of correlations. One type of correlation acts at short distances, L < 300. The corresponding MF is negative, which reflects the antipersistent nature of such correlations. The other type, with a positive memory function, acts at long distances, L > 300. The strength of these persistent correlations decreases as a power-law function. We demonstrate that the power-law decrease of the memory function results in the self-similarity of coarse-grained texts with respect to the decimation procedure.

The paper is organized as follows. In the next Sec- 
tion, we introduce some general relations for the additive 
Markov chains and present an equation connecting the 
correlation and memory functions. Section III contains 
the application of the concept of additive Markov chains 
to literary works. In Conclusion, we summarize the ob- 
tained results. 

MATHEMATICAL MODEL 
Markov Processes 

Let us consider a homogeneous binary sequence of symbols, $a_i \in \{0, 1\}$. To determine the $N$-step Markov chain we have to introduce the conditional probability $P(a_i \mid a_{i-N}, a_{i-N+1}, \dots, a_{i-1})$ of occurrence of the definite symbol $a_i$ (for example, $a_i = 1$) after the $N$-word $T_{N,i}$, where $T_{N,i}$ stands for the sequence of symbols $a_{i-N}, a_{i-N+1}, \dots, a_{i-1}$. Thus, it is necessary to define $2^N$ values of the $P$-function corresponding to each possible configuration of the symbols in the $N$-word $T_{N,i}$. Since we will apply our theory to sequences with long memory lengths of the order of $10^6$, some special restrictions on the class of $P$-functions should be imposed. We consider conditional probability functions of the additive form,

$$P(a_i = 1 \mid T_{N,i}) = \sum_{r=1}^{N} f(a_{i-r}, r). \tag{1}$$

Here the function $f(a_{i-r}, r)$ describes the additive contribution of the symbol $a_{i-r}$ to the conditional probability of occurrence of the symbol unity, $a_i = 1$, at the $i$th site. The homogeneity of the Markov chain is provided by the independence of the conditional probability (1) of the index $i$. It is possible to consider Eq. (1) as the first term in the expansion of the conditional probability in a formal series, where each term corresponds to the additive (unary), binary, ternary, and so on functions up to the $N$-ary one.

Let us rewrite Eq. (1) in an equivalent form,

$$P(a_i = 1 \mid T_{N,i}) = \bar{a} + \sum_{r=1}^{N} F(r)\,(a_{i-r} - \bar{a}), \tag{2}$$

with

$$\bar{a} = \frac{\sum_{r=1}^{N} f(0, r)}{1 - \sum_{r=1}^{N} \bigl(f(1, r) - f(0, r)\bigr)}$$

and

$$F(r) = f(1, r) - f(0, r).$$

We refer to $F(r)$ as the memory function (MF). It describes the strength of the influence of a previous symbol $a_{i-r}$ ($r = 1, \dots, N$) upon the generated one, $a_i$. It can be shown that $\bar{a}$ coincides with the value of $a_i$ averaged over the whole sequence. To the best of our knowledge, the concept of the memory function for many-step Markov chains was originally used in Refs. [20, 25], where such chains proved well suited to describing LRSCS.
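The additive rule for the conditional probability in terms of the memory function translates directly into a generator of binary sequences. The sketch below is only an illustration under stated assumptions: the name `generate_chain`, the uncorrelated initialization of the first N symbols, and the clipping of the probability to [0, 1] are choices made for this example and are not taken from the paper.

```python
import random


def generate_chain(F, a_bar, length, seed=0):
    """Generate a binary additive Markov chain.

    F[r - 1] holds the memory function F(r) for r = 1..N, a_bar is the mean
    value of the symbols. Each new symbol equals 1 with probability
    a_bar + sum_r F(r) * (a_{i-r} - a_bar).
    """
    rng = random.Random(seed)
    N = len(F)
    # Assumption: the first N symbols are uncorrelated draws with mean a_bar.
    seq = [1 if rng.random() < a_bar else 0 for _ in range(N)]
    while len(seq) < length:
        p = a_bar + sum(F[r - 1] * (seq[-r] - a_bar) for r in range(1, N + 1))
        p = min(1.0, max(0.0, p))  # guard: keep p a valid probability
        seq.append(1 if rng.random() < p else 0)
    return seq[:length]


if __name__ == "__main__":
    chain = generate_chain([0.1, 0.05, 0.02], a_bar=0.5, length=20)
    print(chain)
```

With a vanishing memory function the generator produces an uncorrelated biased sequence, while positive (negative) values of `F` favor repetition (alternation) of symbols.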

The memory function $F(r)$ contains complete information about the correlation properties of the Markov chain. Usually, the correlation function and other moments are employed as the input characteristics describing correlated random systems. However, the correlation function describes not only the direct interconnection of the elements $a_i$ and $a_{i+r}$, but also takes into account their indirect interaction via other intermediate elements. Our approach operates with the "origin" characteristic of the system, specifically with the memory function. This allows one to disclose the fundamental intrinsic properties of the system that provide the correlations between the elements.

A sequence of symbols in a Markov chain can be thought of as the sequence of states of some particle that participates in a correlated Brownian motion. Thus, every $L$-word (a set of consecutive symbols of length $L$) can be considered as one of the realizations of the ensemble of correlated Brownian trajectories in the "time" interval $L$. Positive values of the MF result in persistent diffusion, where previous displacements of the Brownian particle in some direction provoke its subsequent displacement in the same direction. Negative values of the MF correspond to antipersistent diffusion, where changes in the direction of motion are more probable. Another physical system, the Ising chain of spins with long-range interactions, could also be associated with the Markov sequence, for which positive values of the MF correspond to the attraction of spins whereas negative ones correspond to repulsion.

Below we will use some more statistical characteristics of random sequences. We consider the distribution $W_L(k)$ of the words of definite length $L$ by the number $k$ of unities in them, $k_i(L) = \sum_{l=1}^{L} a_{i+l}$, and the variance $D(L)$,

$$D(L) = \overline{(k - \bar{k})^2}, \tag{3}$$

where the average $\overline{g(k)}$ is defined as $\overline{g(k)} = \sum_{k=0}^{L} g(k)\, W_L(k)$. Another important quantity is the correlation function,

$$K(r) = \overline{a_i a_{i+r}} - \bar{a}^2, \qquad K(0) = \bar{a}(1 - \bar{a}). \tag{4}$$

By definition, the correlation function is even, $K(r) = K(|r|)$. It is connected with the above-mentioned variance by the equation [20]

$$K(r) = \frac{1}{2}\bigl(D(r-1) - 2D(r) + D(r+1)\bigr), \tag{5}$$

or

$$K(r) = \frac{1}{2}\frac{d^2 D(r)}{dr^2} \tag{6}$$

in the continuous limit.
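The variance of the number of unities in L-words and its second difference can be evaluated numerically along the following lines. This is a minimal sketch: the names `variance` and `correlation_from_variance` are hypothetical, and the averaging over all overlapping windows of the sequence is an assumption about how the average over words is taken.

```python
def variance(seq, L):
    """D(L): variance of the number of ones over all length-L windows of seq."""
    n = len(seq) - L + 1          # number of overlapping windows
    k = sum(seq[:L])              # count of ones in the first window
    counts = [k]
    for i in range(1, n):
        k += seq[i + L - 1] - seq[i - 1]   # slide the window by one symbol
        counts.append(k)
    mean = sum(counts) / n
    return sum((c - mean) ** 2 for c in counts) / n


def correlation_from_variance(D, r):
    """K(r) as half the second difference of D; D must satisfy D(-r) = D(r)."""
    return 0.5 * (D(r - 1) - 2 * D(r) + D(r + 1))


if __name__ == "__main__":
    seq = [0, 1] * 50                          # strictly alternating sequence
    D = lambda L: variance(seq, abs(L))        # even extension, D(0) = 0
    print(correlation_from_variance(D, 0), correlation_from_variance(D, 1))
```

For the alternating test sequence the mean is 1/2, so the sketch gives K(0) = 1/4 and a negative K(1), as expected for a perfectly antipersistent sequence.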

The memory function used in the earlier studies of the step-like model is characterized by a step-like behavior and is defined by two parameters only: the memory depth $N$ and the strength $f$ of the symbols' correlations. The value of $f$ was assumed to be independent of the distance $r$ between the symbols at $r < N$. This memory function was employed to describe the long-range persistent properties of coarse-grained literary texts, specifically, the super-linear dependence of the variance $D(L)$. However, it does not reflect the antipersistent behavior of $D(L)$ observed at short distances. Obviously, we need a more complex memory function for a detailed description of both the short-range and long-range properties of coarse-grained texts.



Equation for the memory function 

We suggest two methods for finding the memory function $F(r)$ of the Markov chain $a_i$ that possesses the same correlation function as a given random sequence $b_i$. The first one is based on the minimization of a "distance", Dist, between the Markov chain generated by means of a sought-for MF and the initial sequence $b_i$. This distance is determined by the formula

$$\mathrm{Dist} = \overline{\bigl(b_i - P(b_i = 1 \mid T_{N,i})\bigr)^2}, \tag{7}$$

with the conditional probability $P$ defined by Eq. (2). Equating the variational derivative $\delta\mathrm{Dist}/\delta F(r)$ to zero, we get the following relation between the memory function $F(r)$ and the correlation function $K(r)$:

$$K(r) = \sum_{r'=1}^{N} F(r')\, K(r - r'), \qquad r \geq 1. \tag{8}$$

Equation (8) can also be obtained by a straightforward calculation of the expression $\overline{a_i a_{i+r}}$ in Eq. (4) using definition (2) of the memory function.

The second method, resulting from the first one, establishes a relationship between the memory function $F(r)$ and the variance $D(L)$,

$$M(r, 0) = \sum_{r'=1}^{N} F(r')\, M(r, r'), \qquad r \geq 1, \tag{9}$$

$$M(r, r') = D(r - r') - \bigl(D(-r') + r\,[D(-r' + 1) - D(-r')]\bigr).$$

It is a set of linear equations for $F(r)$ with coefficients $M(r, r')$ determined by $D(r)$. Equation (5) and $D(-r) = D(r)$ are used here.

The function $K(r)$, being a second derivative of $D(r)$, is less manageable and robust in computer simulations. This is why we prefer the second method, Eq. (9). It is our instrument for finding the memory function $F(r)$ of a sequence from the known variance $D(L)$. The robustness of the method in numerical simulations was demonstrated in Ref. [29].
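The relation between the memory and correlation functions, K(r) = sum over r' of F(r') K(r - r'), is a linear system for F(1), ..., F(N) and can be solved by standard elimination. The sketch below illustrates this first method only (the correlation-function form, not the variance-based form preferred in the text), because it is the shorter one to write down. The helper names `solve_linear` and `memory_from_correlation` are hypothetical, and plain Gaussian elimination is used purely for self-containedness.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def memory_from_correlation(K, N):
    """Return [F(1), ..., F(N)] from K[0..N], using K(-r) = K(r)."""
    A = [[K[abs(r - rp)] for rp in range(1, N + 1)] for r in range(1, N + 1)]
    b = [K[r] for r in range(1, N + 1)]
    return solve_linear(A, b)


if __name__ == "__main__":
    # Check on a one-step chain, for which K(r) = K(0) * f**r with F(1) = f.
    K = [0.25 * 0.3 ** r for r in range(4)]
    print(memory_from_correlation(K, 3))
```

For the one-step test chain the recovered memory function is concentrated at r = 1, which gives a quick sanity check of the sign and index conventions.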



LITERARY TEXTS VIEWED AS MARKOV CHAINS

Variance and correlation function 

Let us apply our method to the investigation of the correlation properties of coarse-grained literary texts. At the outset, we examine the variance $D(L)$ of the coarse-grained text of the King James Version of the Bible [31]. The result of the numerical simulation is presented by the solid line in Fig. 1. The straight dotted line describes the variance $D_0(L) = L\bar{a}(1 - \bar{a})$, which corresponds to non-correlated biased Brownian diffusion. One of the typical coarse-graining procedures was used for mapping the letters of the text onto the symbols zero and unity, (a-m) $\mapsto$ 0, (n-z) $\mapsto$ 1. It is clearly seen that the diffusion is antipersistent at small distances, $L < 300$ (see inset), whereas it is persistent at long distances [32]. The deviation of the solid line from the dotted one testifies to the existence of correlations in the text of the Bible. To confirm this statement we break down the original text into subsequences of a given length $L_0 = 3000$ and randomly shuffle them. The results of the calculation of the variance for the coarse-grained initial and
FIG. 1: The variance $D(L)$ for the coarse-grained text (letters (a-m) $\mapsto$ 0, letters (n-z) $\mapsto$ 1) of the Bible (solid line) and the Markov chain generated by means of the reconstructed memory function $F(r)$ (filled circles). The coincidence of these curves proves the robustness of our method of the MF reconstruction. The dotted straight line describes the non-correlated Brownian diffusion, $D_0(L) = L\bar{a}(1 - \bar{a})$. The inset demonstrates the antipersistent dependence of the dimensionless ratio $D(L)/D_0(L)$ upon $L$ at short distances.



shuffled texts of the Bible are given in Fig. 2. For $L \ll L_0$, the difference in $D(L)$ is negligible [33]. At $L \sim L_0$, the variance and correlation function of the shuffled sequence are smaller than the original ones. At $L > L_0$, the correlations in the shuffled text vanish. In this region, the variance $D(L)$ is a linear function, and the correlation function, being the second derivative of the variance, equals zero.
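The block-shuffling test described above is simple to reproduce. The following sketch cuts a sequence into consecutive blocks of a given length and permutes the blocks at random; the function name `shuffle_blocks` and the fixed random seed are assumptions made for reproducibility of the example.

```python
import random


def shuffle_blocks(seq, block_len, seed=0):
    """Cut seq into consecutive blocks of block_len symbols and shuffle the blocks.

    The order of symbols inside each block is preserved, so correlations at
    distances much shorter than block_len survive, while longer ones are destroyed.
    """
    blocks = [seq[i:i + block_len] for i in range(0, len(seq), block_len)]
    rng = random.Random(seed)
    rng.shuffle(blocks)
    return [symbol for block in blocks for symbol in block]


if __name__ == "__main__":
    print(shuffle_blocks(list(range(12)), block_len=3))
```

If the sequence length is not a multiple of `block_len`, the last block is shorter than the others; it is shuffled together with the rest.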

It is easy to show that the correlation function of the shuffled sequence can be written as

$$K(r) = \begin{cases} K_0(r)\left(1 - \dfrac{r}{L_0}\right), & r < L_0, \\ 0, & r > L_0, \end{cases} \tag{10}$$

where $K_0(r)$ is the correlation function of the original non-shuffled sequence. The corresponding variance, obtained by double numerical integration (see Eq. (6)) of the function $K(r)$ given by Eq. (10), is shown in Fig. 2 by the solid line.
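The step from the correlation function of the shuffled sequence to its variance can be sketched as follows. The discrete counterpart of the double integration is the double sum D(L) = L K(0) + 2 sum over r from 1 to L-1 of (L - r) K(r), whose second difference reproduces 2K(r). The function names below are hypothetical, and the use of the discrete sum in place of a numerical integral is an assumption of this sketch.

```python
def shuffled_correlation(K0, L0):
    """Correlation function after block shuffling: K0(r) * (1 - r / L0) for r < L0, else 0."""
    return [K0[r] * (1 - r / L0) if r < L0 else 0.0 for r in range(len(K0))]


def variance_from_correlation(K, L):
    """D(L) = L*K(0) + 2*sum_{r=1}^{L-1} (L - r)*K(r); requires len(K) >= L."""
    return L * K[0] + 2 * sum((L - r) * K[r] for r in range(1, L))


if __name__ == "__main__":
    K0 = [0.25, 0.1, 0.05, 0.02]              # illustrative values only
    K = shuffled_correlation(K0, L0=2)
    print([variance_from_correlation(K, L) for L in range(1, 5)])
```

For an uncorrelated sequence, where only K(0) is nonzero, the sum reduces to the linear law D(L) = L K(0) of ordinary Brownian diffusion.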

Along with the global characteristic $D(L)$, it is interesting to study its local analogue,

$$D_l(L) = \bigl\langle (k - \langle k \rangle_{L_0})^2 \bigr\rangle_{L_0}, \tag{11}$$

where $L_0$ is the interval of local averaging and $l$ is the coordinate of the left border of this interval. The existence of a trend in the dependence of $D_l(L)$ on $l$ would clearly indicate non-stationarity of the stochastic process being studied. To verify the stationarity of the coarse-grained text of the Bible we perform a numerical simulation of the dependence of $D_l(L)$ on $l$ at different fixed values of $L$. As an example, the result of this simulation




FIG. 2: The variance $D(L)$ for the coarse-grained text ((letters with even numbers in the alphabet) $\mapsto$ 1, (ones with odd numbers) $\mapsto$ 0) of the Bible (dashed line) and for the sequence obtained by shuffling blocks of length $L_0 = 3000$ (filled circles). The solid line represents the analytical results obtained with Eq. (10). The dotted straight line describes the non-correlated Brownian diffusion, $D_0(L) = L\bar{a}(1 - \bar{a})$.

FIG. 3: The local variance $D_l(10)$ for the coarse-grained text of the Bible vs the distance $l$. The averaging interval is $L_0 = 10^5$.



for $L = 10$ is shown in Fig. 3. It is clearly seen that there exist regular fluctuations without a pronounced trend. The fluctuations result from the finiteness of the averaging interval $L_0$. This fact allows us to conclude that the coarse-grained text of the Bible is stationary. A similar analysis of many other texts gave the same result. It is therefore expedient to study the global characteristic $D(L)$, Eq. (3), of the sequence instead of the local one, $D_l(L)$.
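A stationarity check based on the local variance can be sketched as below. The function name `local_variance` and the convention that only windows lying entirely inside the averaging interval are counted are assumptions of this example; the paper does not specify how the interval borders are handled.

```python
def local_variance(seq, L, left, L0):
    """Variance of the number of ones in length-L windows inside seq[left:left + L0].

    Assumption: only windows that fit entirely inside the averaging interval
    of length L0 starting at position `left` are included.
    """
    counts = [sum(seq[i:i + L]) for i in range(left, left + L0 - L + 1)]
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)


if __name__ == "__main__":
    seq = [0, 1] * 50
    # Scan the left border of the averaging interval and look for a trend.
    print([local_variance(seq, L=1, left=l, L0=20) for l in range(0, 80, 20)])
```

A roughly constant output as `left` is scanned indicates the absence of a trend; a systematic drift would point to non-stationarity.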



Memory function 

According to Eqs. (8) and (9), the memory function can be restored using the variance or the correlation function. The MF thus obtained for the coarse-grained text of the Bible at $r < 300$ is given in Fig. 4. At long distances, $r > 300$, the memory function can be nicely approximated by the power function $F(r) = 0.27\, r^{-1.1}$, which is shown by the solid line in the inset in Fig. 4. Note that the persistent part of the MF, $F(r > 300) < 0.0008$, is much smaller than its typical magnitude 0.02 in the antipersistent region $r < 40$.
FIG. 4: The memory function $F(r)$ for the coarse-grained text of the Bible at short distances. The power-law decreasing portion of the $F(r)$ plot for the Bible is presented by filled circles in the inset. The solid line corresponds to the power-law fitting.

It should be emphasized that the short-range part of the memory function at $r < 40$, as well as the $D(L)$ function at $L < 300$, depends essentially on the method of coarse-graining. Nevertheless, the antipersistent correlations exist for practically all kinds of coarse-graining procedure. An interesting feature is that the short region $r < 40$ of negative (antipersistent) memory function produces antipersistent behavior of the variance $D(L)$ over much longer distances, $L \sim 300$.

In order to demonstrate the universal character of the power-law decrease of the memory function at long distances, we compare the MF of the coarse-grained texts for more than fifty different literary works. The texts are coarse-grained by mapping the letters from the first and second halves of the alphabet into zero and unity, respectively. Subsequently, we first calculate the variances and then the memory functions. All curves for the memory functions can be well fitted by the power-law functions $F(r) = c\, r^{-b}$. The results of the fitting for eight texts written in or translated into Russian [34, 35] are shown in Fig. 5. The exponents in all curves vary over the interval between $b_{\min} = 1.02$ for "War and Peace" and $b_{\max} = 1.56$ for the Koran. Thus, the constants $c$ and $b$ can be used for the linguistic classification of different literary works. It is interesting to see that the memory functions for the texts of the English- and Russian-worded Bible, as well as the texts of the Old and New Testaments, are practically coincident.
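A power-law fit of the form F(r) = c r^(-b) can be obtained, for instance, by linear least squares in log-log coordinates. The paper does not state which fitting procedure was used, so the method, the function name `fit_power_law`, and the synthetic test data below are all assumptions of this sketch.

```python
import math


def fit_power_law(r_values, F_values):
    """Fit F = c * r**(-b) by least squares on log F versus log r.

    Returns (c, b). All r and F values must be positive.
    """
    xs = [math.log(r) for r in r_values]
    ys = [math.log(F) for F in F_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope


if __name__ == "__main__":
    r = [300, 1000, 3000, 10000]
    F = [0.27 * x ** -1.1 for x in r]   # synthetic data following the quoted fit
    print(fit_power_law(r, F))
```

Because the logarithm is taken, the fit applies only to the persistent region where the memory function is positive; noisy values near zero would need to be excluded or binned first.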
FIG. 5: The memory function at long distances for the coarse-grained texts of eight literary works: 1. The Bible, 2. "Oliver Twist" by Charles Dickens, 3. "War and Peace" by Leo Tolstoy, 4. The Tora, 5. "Master and Margarita" by Mikhail Bulgakov, 6. "Don Quixote" by Miguel de Cervantes, 7. "Oblomov" by Ivan Goncharov, 8. The Koran.

The existence of two characteristic regions having dif- 
ferent behavior of the memory function and, correspond- 
ingly, of the persistent and antipersistent portions in the 
D(L) dependence appears to be a prominent feature of 
all texts in any language. Note that the antipersistent 
portion of the memory function corresponds to the re- 
gion where the grammatical rules are in use. Therefore, 
we call this kind of correlations the "grammatical" ones.
The persistent correlations in a text at very long dis- 
tances can be related to a general idea of the literary 
work. Thus, this kind of correlations is referred to as the 
"semantic" ones. 

Two fundamentally different portions in the MF plots result from a peculiar competition between the two above-mentioned kinds of correlations. We would like to stress that both portions of the MF are equally important for gaining an insight into the correlation properties of literary texts. To support this statement we generate two special sequences. In each of them, only one portion of the memory function of the coarse-grained text of the Bible is taken into account, and the memory function in the other region is assumed to be zero. The variance $D(L)$ for these two sequences is given in Fig. 6. The lower (dashed) line corresponds to the case where only the negative antipersistent portion, $r < 40$, of the memory function is taken into account. The upper (dash-dot-dotted) curve corresponds to the sequence generated by means of the long-range persistent memory, $F(r) = 0.27\, r^{-1.1}$, $r > 100$. It is evident that the sequence generated with the antipersistent memory function displays sub-diffusion only, whereas the sequence that corresponds to the persistent memory function is characterized by super-diffusive behavior of the variance $D(L)$. The difference between the variances for the two generated sequences and for the original coarse-grained text of the Bible, shown by the solid line in the same figure, corroborates our assumption about the significance of both portions of the memory function.







FIG. 6: The variance $D(L)$ for the coarse-grained text of the Bible [31] (solid line), and for the sequences constructed using the persistent part of the MF (dash-dot-dotted line) and the antipersistent one (dashed line). The dotted line describes the non-correlated Brownian diffusion, $D_0(L) = L\bar{a}(1 - \bar{a})$.



Self-similarity of the coarse-grained texts

The power-law decrease (without a characteristic scale) of the memory function at long distances leads to quite an essential property of self-similarity of coarse-grained texts with respect to the decimation procedure. This procedure implies the deterministic or random removal of some part of the symbols from a sequence and is characterized by the decimation parameter $\lambda < 1$, which represents the fraction of symbols kept in the chain. For example, under random decimation each symbol is eliminated with probability $1 - \lambda$. It can be shown that both of these procedures, deterministic and stochastic, are equivalent for a Markov chain. The sequence is self-similar if its variance $D(L)$ does not change after the decimation up to a definite value of $L$ (which depends on the memory length of the original sequence and the decimation parameter). The model of the additive binary many-step Markov chain with the step-like MF offers the exact property of self-similarity at lengths shorter than the memory length $N$. The coarse-grained literary texts have the self-similarity property as well. This is illustrated in Fig. 7, where three $D(L)$ curves correspond to different values of the parameter of the regular decimation. Note that the decimation procedure leads to a decrease in the effective memory length. As a result, the variance curves coincide up to the effective memory depth, which is proportional to the decimation parameter. A similar phenomenon occurs in the case of random decimation as well.

FIG. 7: Numerically calculated variance $D(L)$ for the coarse-grained text of the Bible [31] (solid line) and for the sequences obtained after its regular decimation. Circles, triangles, and dots correspond to the decimation parameters 2, 4, and 8, respectively. The dotted line describes the non-correlated Brownian diffusion, $D_0(L) = L\bar{a}(1 - \bar{a})$. The similar curves obtained for the sequence constructed by using the long-range part of the Bible's memory function only are shown in the inset.


A question arises: what particular property of the memory function is crucial for the self-similarity of coarse-grained literary texts? It is natural to assume that the persistent long-range scale-free portion of the memory function affords this property because the self-similarity is specifically manifest at long distances. To verify this supposition we carry out the decimation procedure with different $\lambda$ for the Markov chain constructed by using the long-range part of the Bible memory function only and then plot the corresponding $D(L)$ dependence. The curves are shown in the inset in Fig. 7. It is seen that the property of self-similarity for this sequence appears to be much more pronounced than for the original coarse-grained text of the Bible. Moreover, the antipersistent part of the MF disappears very fast after the decimation procedure. This is clearly observed as a disappearance of the antipersistent sub-linear portion of the $D(L)$ curves in Fig. 7, where after decimation the solid line transforms into a wholly persistent super-linear curve, which goes above the curve $D_0 = L\bar{a}(1 - \bar{a})$. The conclusion about the invariance of the statistical properties of the studied sequence with respect to the decimation procedure is an additional argument in favor of the efficiency of coarse-graining. The decimation can be considered as an additional coarse-graining of the initial random sequence.
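Both variants of the decimation procedure are easy to sketch. In the example below the function names are hypothetical; regular decimation is taken to mean keeping every `step`-th symbol (so that the kept fraction is 1/step, which is how the integer parameters 2, 4 and 8 of the regular decimation are interpreted here, as an assumption), and random decimation keeps each symbol independently with a given probability.

```python
import random


def decimate_regular(seq, step):
    """Deterministic decimation: keep every step-th symbol (kept fraction 1/step)."""
    return seq[::step]


def decimate_random(seq, keep_prob, seed=0):
    """Random decimation: keep each symbol independently with probability keep_prob."""
    rng = random.Random(seed)
    return [s for s in seq if rng.random() < keep_prob]


if __name__ == "__main__":
    seq = list(range(10))
    print(decimate_regular(seq, 2))
    print(decimate_random(seq, 0.5))
```

Self-similarity can then be probed by computing the variance of the number of unities for the original and decimated sequences and comparing the curves at small window lengths.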
CONCLUSION

Thus, we have demonstrated that literary works can be suitably described in terms of Markov chains with complex memory functions. Actually, the memory function appears to be a convenient informative "visiting card" of any symbolic stochastic process. We have studied coarse-grained literary texts and shown the complexity of their organization, in contrast to a previously discussed simple power-law decrease of correlations. We have shown that the competition between two kinds of correlations governs the statistical properties of coarse-grained texts. The antipersistent correlations exist at short distances, L < 300, in the region where grammatical rules are effective. The other kind of correlations, the persistent one, plays the main role at long distances, L > 300. It can be related to the general idea of a literary work. Therefore, the first kind of correlations may be referred to as grammatical, whereas the second kind may be called semantic. However, the nature of the correlations should be clarified by linguists.

If our supposition about the nature of both kinds of correlations in literary texts is correct, several important questions will be of great interest, e.g.:

• Does the lack of the antipersistent portion in the memory function (and in the $D(L)$ dependence [29]) in DNA texts mean that "grammatical rules" are absent in the "DNA language"?

• If we consider the variance $D(L)$ as a measure of information redundancy, can we explain the relation $D(L)|_{\rm DNA} \sim 10 \cdot D(L)|_{\rm Text}$, resulting from the comparison between literary and DNA texts at $L \sim 3 \times 10^5$ [20], in the following way: Nature is more careful about the conservation of the information stored in DNA sequences than the Writer is in his literary works?

We have examined the simplest example of random sequences, the dichotomic one. However, our preliminary consideration shows that the presented concept of additive Markov chains can be generalized to a larger class of random Markov processes with a finite or infinite number of states in discrete or continuous "time". The suggested approach can be used for the analysis of other correlated systems in different fields of science.



[1] U. Balucani, M. H. Lee, V. Tognetti, Phys. Rep. 373, 409 (2003).
[2] I. M. Sokolov, Phys. Rev. Lett. 90, 080601 (2003).
[3] A. Bunde, S. Havlin, E. Koscienly-Bunde, H.-J. Schellenhuber, Physica A 302, 255 (2001).
[4] H. N. Yang, Y.-P. Zhao, A. Chan, T.-M. Lu, and G. C. Wang, Phys. Rev. B 56, 4224 (1997).
[5] S. N. Majumdar, A. J. Bray, S. J. Cornell, and C. Sire, Phys. Rev. Lett. 77, 3704 (1996).
[6] S. Havlin, R. Selinger, M. Schwartz, H. E. Stanley, and A. Bunde, Phys. Rev. Lett. 61, 1438 (1988).
[7] R. F. Voss, Phys. Rev. Lett. 68, 3805 (1992).
[8] H. E. Stanley et al., Physica A 224, 302 (1996).
[9] S. V. Buldyrev, A. L. Goldberger, S. Havlin, R. N. Mantegna, M. E. Matsa, C.-K. Peng, M. Simons, H. E. Stanley, Phys. Rev. E 51, 5084 (1995).
[10] A. Provata and Y. Almirantis, Physica A 247, 482 (1997).
[11] R. M. Yulmetyev, N. Emelyanova, P. Hanggi, F. Gafarov, and A. Prohorov, Physica A 316, 671 (2002).
[12] B. Hao, J. Qi, Mod. Phys. Lett. 17, 1 (2003).
[13] R. N. Mantegna, H. E. Stanley, Nature (London) 376, 46 (1995).
[14] Y. C. Zhang, Europhys. News 29, 51 (1998).
[15] A. Czirok, R. N. Mantegna, S. Havlin, and H. E. Stanley, Phys. Rev. E 52, 446 (1995).
[16] A. Schenkel, J. Zhang, and Y. C. Zhang, Fractals 1, 47 (1993).
[17] I. Kanter and D. A. Kessler, Phys. Rev. Lett. 74, 4559 (1995).
[18] P. Kokol, V. Podgorelec, Complexity International 7, 1 (2000).
[19] W. Ebeling, A. Neiman, T. Poschel, arXiv:cond-mat/0204076.
[20] O. V. Usatenko, V. A. Yampol'skii, K. E. Kechedzhy, and S. S. Mel'nyk, Phys. Rev. E 68, 06117 (2003).
[21] H. A. Makse, S. Havlin, M. Schwartz, and H. E. Stanley, Phys. Rev. E 53, 5445 (1995).
[22] W. Li, Europhys. Lett. 10, 395 (1989).
[23] R. F. Voss, in: Fundamental Algorithms in Computer Graphics, ed. R. A. Earnshaw (Springer, Berlin, 1985) p. 805.
[24] M. F. Shlesinger, G. M. Zaslavsky, and J. Klafter, Nature (London) 363, 31 (1993).
[25] O. V. Usatenko and V. A. Yampol'skii, Phys. Rev. Lett. 90, 110601 (2003).
[26] S. Hod and U. Keshet, Phys. Rev. E 70, 015104(R) (2004).
[27] S. L. Narasimhan, J. A. Nathan, and K. P. N. Murthy, Europhys. Lett. 69 (1), 22 (2005).
[28] S. L. Narasimhan, J. A. Nathan, P. S. R. Krishna, and K. P. N. Murthy, arXiv:cond-mat/0409053.
[29] S. S. Melnyk, O. V. Usatenko, and V. A. Yampol'skii, arXiv:physics/0412169.
[30] F. M. Izrailev, A. A. Krokhin, and S. E. Ulloa, Phys. Rev. B 63, 041102(R) (2001).
[31] The Old Testament of the King James Version of the Bible, http://www.writersbbs.com/bible/
[32] At length $L$ of the order of the full text length $M \sim 10^6$, we observe a violation of the monotonic growth of the variance $D(L)$. It is a manifestation of the border effect. Obviously, $k(L) \to \bar{k}$ and the variance $D(L) \to 0$ (see definition Eq. (3)) at $L \to M$.
[33] Note that the antipersistent regions in Figs. 1 and 2 differ from each other; this distinction is connected to the difference in the coarse-graining procedures used in these two cases.
[34] Russian Synodal LiO 31/7/91, http://www.lib.ru/hristian/bibliya/nowyj_zawet.txt.
[35] http://www.lib.ru



